ABSTRACT

PvuRts1I is a prototype for a larger family of restriction endonucleases
that cleave DNA containing 5-hydroxymethylcytosine (5hmC) or
5-glucosylhydroxymethylcytosine (5ghmC), but not 5-methylcytosine (5mC)
or cytosine. Here, we report a crystal structure of the enzyme at 2.35 Å
resolution. Although the protein has been crystallized in the absence of
DNA, the structure is very informative. It shows that PvuRts1I consists
of an N-terminal, atypical PD-(D/E)XK catalytic domain and a C-terminal
SRA domain that might accommodate a flipped 5hmC or 5ghmC base. Changes
to predicted catalytic residues of the PD-(D/E)XK domain or to the
putative pocket for a flipped base abolish catalytic activity.
Surprisingly, fluorescence changes indicative of base flipping are not
observed when PvuRts1I is added to DNA substrates containing
pyrrolocytosine in place of 5hmC (5ghmC). Despite this caveat, the
structure suggests a model for PvuRts1I activity and presents
opportunities for protein engineering to alter the enzyme properties for
biotechnological applications.

INTRODUCTION 

The mechanistic basis of modification-specific DNA binding and, in some
cases, cleavage has attracted much interest. Based on experimental
structures or confident homology models, we now have a detailed picture
of 5-methylcytosine (5mC) specific binding by MBD1 (1), MBD2 (2), MBD4
(3), MeCP2 (4), Kaiso (5) and the replication fork-associated UHRF1
(6-8). There are also structural data about 5mC specific enzymes: McrBC
has been crystallized in complex with DNA (9), and for MspJI a very
informative structure in the absence of DNA has been determined (10).
Based on these studies, the methyl binding proteins/enzymes can be
divided into two broad groups, depending on whether they recognize the
methyl group in the context of double stranded DNA or whether they flip
the modified base to scrutinize it in a dedicated pocket. MBDs (1-4),
MeCP2 (4) and Kaiso (5) interact with the modified base in a
Watson-Crick pair. In contrast, UHRF1 and most likely also MspJI share a
so-called SRA (SET and RING associated) domain that flips and
accommodates the modified base (6-8,10). The same is true for McrBC,
even though the flipped base binding domain is in this case not
homologous to the SRA of UHRF1 and MspJI (9).

The presence of 5-hydroxymethylcytosine (5hmC) in 
phage (11) and mammalian DNA has been known for a
long time (although the initial estimates for the amount of 
5hmC in mammalian DNA were too high). Much recent 
research was triggered by the identification of the function 
of TET (ten-eleven translocation) proteins as 5mC oxidiz- 
ing enzymes (12-13), and the role of 5hmC as a demethy- 
lation intermediate (14-15), epigenetic mark (16) and diag- 
nostic marker in cancer (17). Recent pull-down/mass spec- 
trometry studies have also shown that there is a large reper- 
toire of 5hmC binding proteins in vertebrate tissues (18- 
19). Some 5hmC binding proteins, such as UHRF1, bind
also 5mC, and their interaction with 5hmC can be modeled 
based on the interactions with 5mC (20). Other proteins, 
such as MBD3, which binds to 5hmC according to some 
(21) (but not other (18)) studies, are homologous to struc- 
turally characterized 5mC binding proteins and therefore 
their possible interactions with 5hmC can be deduced (21). 
However, for most other proteins that were identified in the 
mass spectrometry experiments, it is not even clear whether 
the interaction with 5hmC is direct, and provided it is, how 
the 5hmC base is 'read'. It is also still not understood how 
the presence of the 5hmC base can trigger an enzymatic re- 
action. 

The endonuclease PvuRts1I from Proteus vulgaris (strain) has been
reported to be a dimer (22) like most endonucleases that catalyze double
strand breaks. It cleaves DNA that contains either 5hmC or 5ghmC
(5-glucosylhydroxymethylcytosine) bases (23), with a preference for
α-5ghmC over β-5ghmC (22). Cleavage is most efficient when two 5hmC
(5ghmC) bases are present in opposite DNA strands approximately 22 bases
apart from each other (24). PvuRts1I makes a double strand break
approximately in the middle between the two sites (the precise pattern
is 5'-CN11-13↓N9-10G-3', where the arrow denotes the cleavage site and C
the modified base) (24). Some double strand cleavage can also be
observed when there is only a single 5hmC (5ghmC). The potential
applications of 5hmC sensitive sequencing have triggered the search for
PvuRts1I homologs that exhibit desirable properties for
biotechnological use. This search has led to the identification of a
whole family of enzymes, which differ slightly in the distance
requirement for the modified bases (25). In contrast to PvuRts1I, some
of them such as AbaSI show a preference for 5ghmC over 5hmC (25), which
can be exploited in sequencing by postglucosylation of 5hmC with phage
T4 glucosyltransferase. All tested members of the PvuRts1I family
discriminate between 5hmC and 5mC, but to varying degrees. As 5hmC is
much rarer than 5mC in animal genomes (26), very high discrimination
stringency is required for biotechnological use. Hence, there are at
least two engineering goals to improve PvuRts1I and/or the other family
members. First, it would be desirable to design an enzyme fully
dependent on a single modified site only, which should make a double
strand break on one or both sides of the modified base. Second, it would
be useful to improve the stringency of 5hmC versus 5mC discrimination.
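
The site pattern quoted above can be turned into a simple scan for
candidate cleavage positions. The following is a minimal sketch and not
part of the original study: it assumes a plain top-strand sequence
string, reads the pattern as C, 11-13 nucleotides, cut, 9-10
nucleotides, G, and cannot tell modified from unmodified cytosines. The
function name `candidate_sites` and the tuple layout are illustrative
choices.

```python
def candidate_sites(seq, left=(11, 13), right=(9, 10)):
    """Return (c_index, cut_index, g_index) for every C...G pair that fits
    the pattern C N(left) | N(right) G on the given strand.

    cut_index is the 0-based index of the first nucleotide 3' of the cut.
    Assumption: modification status is not encoded in the sequence, so
    every C is treated as a potential modified base.
    """
    seq = seq.upper()
    hits = []
    for i, base in enumerate(seq):
        if base != "C":
            continue
        for n_left in range(left[0], left[1] + 1):
            for n_right in range(right[0], right[1] + 1):
                g = i + 1 + n_left + n_right
                if g < len(seq) and seq[g] == "G":
                    hits.append((i, i + 1 + n_left, g))
    return hits


# A C and a G separated by 21 intervening nucleotides fit the pattern in
# two ways (11 + 10 and 12 + 9), so two cut positions are reported.
print(candidate_sites("C" + "A" * 21 + "G"))  # [(0, 12, 22), (0, 13, 22)]
```

The G on the scanned strand stands in for the base paired with the
modified cytosine of the opposite strand, which is why the scan looks
for a C and a G rather than for two cytosines.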

Here, we report the crystal structure of PvuRts1I at 2.35 Å resolution.
The structure reveals an N-terminal PD-(D/E)XK domain in agreement with
an earlier prediction (27) and a previously unrecognized C-terminal SRA
domain. Site-directed mutagenesis experiments confirm the importance of
predicted key residues in the structure. Based on the combined
crystallographic and biochemical data, we suggest a structural
explanation for why PvuRts1I requires 5hmC or 5ghmC bases in opposite
strands at a distance of just over 20 base pairs for the introduction of
a double strand break approximately halfway between the modified bases.



MATERIALS AND METHODS 

Cloning 

A codon optimized PvuRts1I REase (pvuRts1I) synthetic gene in pTriEx
(ApR) vector was purchased from Mr. Gene (Germany). The gene was
introduced into pET15bmod (ApR), a derivative of pET15b(+) (ApR), via
EcoRI and XhoI restriction sites, resulting in a construct coding for
the protein with N-terminal MGHHHHHHEF tag. The same gene was also
cloned into pET28a(+) (KnR) via NcoI and XhoI, leading to a construct
for a protein with slightly modified N-terminus (MGSK...) and C-terminal
tag (LEHHHHHH). Mutants of pvuRts1I were generated in the construct for
the N-terminally tagged protein variant using the QuikChange protocol
(28).



Protein expression 

Expression experiments were done in Escherichia coli strain ER2566 (F-
λ- fhuA2 [lon] ompT lacZ::T7 gene 1 gal sulA11 Δ(mcrC-mrr)114::IS10
R(mcr-73::miniTn10-TetS)2 R(zgb-210::Tn10)(TetS) endA1 [dcm]) (from New
England Biolabs). The strain was transformed with plasmids coding for
the N- or C-terminally tagged versions of the PvuRts1I protein. Cells
were grown in LB medium with 50 μg/ml ampicillin at 37°C to OD600 of 0.6
and induced with between 0.1 and 1 mM IPTG. The expression was highest
when cells were grown for 4 h at 22°C. Cells were harvested by
centrifugation and the pellet was stored at -20°C. Expression of the
selenomethionine version of PvuRts1I (with N-terminal tag) was done in
methionine auxotrophic BL834(DE3) cells in defined media lacking
methionine and supplemented with selenomethionine (29).

Protein purification 

Frozen cells expressing PvuRts1I were thawed and resuspended in buffer A
(20 mM Tris/HCl pH 7.6, 400 mM NaCl and 1 mM PMSF). Cells in suspension
were opened by sonication and the cell debris was removed by
centrifugation at 145 000 × g for 30 min. PvuRts1I was purified by
affinity chromatography on nickel nitrilotriacetic acid (Ni-NTA) agarose
resin (Qiagen). The protein was eluted in a gradient of imidazole (30 mM
to 300 mM) in buffer B (20 mM Tris/HCl pH 7.6, 200 mM NaCl and 7 mM
2-mercaptoethanol). Fractions containing PvuRts1I were combined and
concentrated using Vivaspin concentrators (10 kDa MWCO). The protein was
purified further by size exclusion chromatography on a HiLoad 16/60
Superdex 75 column (GE Healthcare), equilibrated with buffer C (20 mM
Tris/HCl pH 7.6, 200 mM NaCl, 1 mM EDTA and 1 mM DTT). Fractions
containing PvuRts1I were pooled and concentrated to 20-24 mg/ml. From 1
liter of culture, ~7 mg of protein was obtained that appeared pure on a
Coomassie-stained SDS-PAGE gel. The variant proteins were obtained
according to the protocol for the wild-type.

Crystallization 

PvuRts1I was concentrated to 23 mg/ml and then the buffer was
supplemented with 0.2 M glucose. Crystals were grown at 18°C by the
hanging drop method. A mix of 2 μl of protein solution and 2 μl of
reservoir buffer was equilibrated against reservoir buffer containing
10% w/v PEG 4000, 20% v/v glycerol, 20 mM D-glucose, 20 mM D-mannose,
20 mM D-galactose, 20 mM L-fucose, 20 mM D-xylose, 20 mM
N-acetyl-D-glucosamine and 0.1 M MOPS/HEPES-Na pH 7.5. For
cryo-protection, crystals were transferred to modified reservoir buffer
supplemented with 28% instead of 20% v/v glycerol.

Structure determination 

Crystals belonged to space group P4(1)2(1)2 with cell dimensions a = b =
62 Å, c = 211 Å and contained one molecule of PvuRts1I in the
asymmetric unit. They diffracted to approximately 3 Å resolution. In the
absence of suitable models for molecular replacement, experimental
phasing was required. Therefore, we grew crystals of the
selenomethionine variant of the protein (containing four selenium atoms
not counting the initiator methionine upstream of the histidine tag),
which turned out to be better than the wild-type protein crystals and
diffracted to 2.35 Å resolution. Diffraction data were collected at a
wavelength of 0.97625 Å, which is just above the selenium edge in
energy. The structure was solved by the single-wavelength anomalous
diffraction (SAD) method. Selenium sites were located using the SHELXD
program (30). The SHELXE program (31) was then used to generate an
experimental electron density map by a combination of phasing and
density modification steps. The density was interpreted automatically
using PHENIX (32), leading to a nearly complete model of PvuRts1I
lacking only a few loops. The model was completed and improved manually
and refined with the COOT (33) and REFMAC (34) programs (Supplementary
Table S1). The final model coordinates and the corresponding structure
factors were deposited at the Protein Data Bank (PDB) with the 4OQ2
accession code.
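
The statement that 0.97625 Å lies just above the selenium edge in energy
can be checked with the standard wavelength-to-energy conversion E =
hc/λ. The sketch below is illustrative only: the constant names are
hypothetical, and the Se K-edge value of 12.658 keV is a commonly
tabulated figure that is assumed here rather than taken from the paper.

```python
# h*c expressed in keV * Angstrom (physical constant, rounded).
HC_KEV_ANGSTROM = 12.39842
# Assumed tabulated Se K absorption edge; not stated in the paper.
SE_K_EDGE_KEV = 12.658


def photon_energy_kev(wavelength_angstrom):
    """Convert an X-ray wavelength in Angstrom to photon energy in keV."""
    return HC_KEV_ANGSTROM / wavelength_angstrom


energy = photon_energy_kev(0.97625)
offset_ev = 1000.0 * (energy - SE_K_EDGE_KEV)
print(f"photon energy: {energy:.3f} keV")
print(f"offset from assumed Se K edge: {offset_ev:.0f} eV")
```

Under these assumptions the data collection energy comes out at about
12.70 keV, a few tens of eV above the edge, which is the usual choice
for maximizing the anomalous signal in a SAD experiment.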

Assay of the PvuRts1I mutants against T4 phage genome

T4 phage DNA was purified according to a published protocol (35). All
the mutants and the wild-type protein were loaded on an SDS gel to
confirm that the protein concentrations were equal. Approximately 0.1 μg
(low) and 1 μg (high) amounts of protein were mixed with 240 ng of T4
phage genome in a buffer containing 50 mM potassium acetate, 20 mM
Tris-acetate, 10 mM magnesium acetate and 1 mM DTT, pH 7.9. The
reactions were incubated at 23°C for 20 min. The reaction mixtures were
loaded on a 1% agarose gel and visualized by Gel Red (Biotium Inc.)
staining.

RESULTS 

PvuRts1I expression and biochemical characterization

A synthetic gene was used to overexpress versions of PvuRts1I with
N-terminal or C-terminal hexahistidine tags in E. coli strain ER2566.
The proteins were purified by affinity and size exclusion
chromatography. The purified recombinant wild-type proteins with tags on
either end, but not controls with changes to important residues, were
active against T4 phage DNA, which is known to contain a large number of
5ghmC bases at various distances to each other. Although protein
activities were at least qualitatively in agreement with the literature
data, the variant of PvuRts1I with N-terminal hexahistidine tag had some
other unexpected properties, at least in our hands. While PvuRts1I
should be a dimer also in the absence of DNA (22), size exclusion
chromatography with the N-terminally tagged, but not the C-terminally
tagged, variant of the enzyme suggested a slightly lower than expected
molecular mass. We also observed a high and unspecific affinity for DNA
(hydroxymethylated, methylated and non-methylated DNA are all bound)
(Supplementary Figures S1 and S2). Despite these undesirable features of
the N-terminally tagged PvuRts1I, we continued work with this variant of
the protein, because it yielded well-diffracting crystals, at least in
the absence of DNA.



Crystallization and structure determination 

Crystallization of PvuRts1I was attempted either in the absence of DNA
or with oligonucleotides containing two 5hmC bases at the appropriate
distance. We either avoided divalent metal cations or used Ca2+ ions,
which support DNA binding, but not cleavage. Finally, we also set up
crystallization trials with oligoduplexes that represent PvuRts1I
cleavage products (except for the 5'-phosphates), in the presence of
either Mg2+ or Ca2+ ions. None of these experiments yielded diffracting
crystals. We concluded that PvuRts1I might have a flexible substrate
binding site, and because we knew that the enzyme accepted 5ghmC
containing DNA, we tried crystallization in the presence of large
amounts of glucose. This proved crucial for crystallization success.
Crystals belonged to space group P4(1)2(1)2, contained one molecule of
PvuRts1I in the asymmetric unit and diffracted up to 2.35 Å resolution.
The structure was solved by the SAD method using a crystal of the
selenomethionine version of the protein.

Gross structure of PvuRts1I

The crystal structure reveals that PvuRts1I is a two-domain protein
(Figure 1). We carried out DALI searches with the two domains against
the PDB database of protein structures (36). The results indicate
significant structural similarity between the N-terminal domain of
PvuRts1I and several PD-(D/E)XK endonucleases (DALI Z-scores of 6.9 for
Ngo0050 from Neisseria gonorrhoeae, 4.9 for V.EcoKDcm, 4.2 for Hjc, 2.7
for PspGI and 2.5 for NgoMIV), in agreement with the inclusion of
PvuRts1I in a bioinformatic survey of highly diverged PD-(D/E)XK
restriction endonucleases (27). We therefore conclude that the
N-terminal part of the enzyme (residues 1-140) harbors the nuclease
activity and henceforth refer to it as the catalytic domain. The DALI
search also revealed a previously unrecognized clear structural
similarity between the C-terminal part of PvuRts1I and various SRA
domain proteins (DALI Z-scores for the corresponding domains: 7.0 for
SUVH5, 6.5 for UHRF1, 6.1 for MspJI and 6.0 for UHRF2). Therefore, the
C-terminal domain of PvuRts1I (residues 141-293) will be referred to as
its SRA domain in the following. As SRA domains recognize modified bases
by flipping them out of the DNA stack into a pocket of the domain (6-8),
PvuRts1I could also be a nucleotide flipping enzyme.

Structure of the nuclease domain and active site prediction 

PD-(D/E)XK restriction endonucleases are named for the (typically)
conserved residues in the active site, which are found in canonical
secondary structure contexts. The core folding motif of PD-(D/E)XK
restriction endonucleases consists of an α-helix that is followed by
three consecutive β-strands, which together form an antiparallel β-sheet
(often with additional strands outside the core motif) (Figure 1). The
PvuRts1I catalytic domain contains the PD-(D/E)XK motif and, as
predicted by the bioinformatic analysis (27), has candidate active site
residues in the expected places (with the exception of the lysine)
(Figure 2). The first catalytic residue, a glutamate in α-helical
context that is not cited in the PD-(D/E)XK consensus, is Glu20. The
'PD' aspartate, which coordinates one or both metal ions in the
PD-(D/E)XK family, is Asp57 (sequence context ADLL) at the N-terminal
end of the second β-strand of the core motif. The canonical '(D/E)XK'
motif in PvuRts1I is changed to 'EID', with Glu68 in the role of the
acidic residue of the motif and an aspartate in the place of the
expected lysine residue. There is a lysine residue (Lys17) elsewhere in
the sequence that is spatially in the proximity of the active site, but
the ε-amino group is not in the expected location. Metal ions are not
present in the active site because crystals were grown in the absence of
divalent metal ions.

Figure 1. PvuRts1I domains and homologs. (A) PvuRts1I catalytic
(residues 1-140) and SRA (residues 141-293) domains are in yellow and
green, respectively. Core elements of the fold are in bright and
additional elements in faint color. (B) PspGI restriction endonuclease,
a homolog of PvuRts1I catalytic domain. (C) SRA domain of UHRF1, a
homolog of PvuRts1I SRA domain (6-8). (D) Alignment of the amino acid
sequences of PvuRts1I catalytic core and homologs. (E) Alignment of the
SRA domain sequences in the pocket region. The PvuRts1I-AbaSI alignment
is sequence based, all other alignments are structure based.

Figure 2. Active site. PvuRts1I (A, yellow) and BcnI as bona fide
PD-(D/E)XK endonuclease with DNA (B, gray) in all-atom representation.
The thin lines in the PvuRts1I panel indicate the positions of the
corresponding residues and substrate DNA in the BcnI structure.

Identification of active site residues by site-directed mutagenesis

Candidate catalytic residues and Glu71, an acidic residue without clear
function in the PD-(D/E)XK motif, were individually replaced by
alanines. The activity of the resulting variants was tested with T4
phage DNA as a substrate in conditions that lead to complete DNA
cleavage by the wild-type enzyme (Figure 3). As expected, replacement of
Glu20 or Asp57 with alanine strongly reduced or abolished activity. The
role of Glu68 in catalysis could not be directly tested because the
Glu68Ala variant of PvuRts1I could not be made in soluble form in E.
coli. Mutation of Glu71 also abolished activity, and at most residual
activity was seen when Lys17 was mutated to alanine. We conclude that
the mutagenesis experiments support the bioinformatic- (27) and
crystallography-based identification of active site residues. These
residues were also noted, but not singled out, in an earlier study of
amino acid conservation in the PvuRts1I family (25).

Figure 3. Effect of mutations on PvuRts1I activity. PvuRts1I and its
variants were either analyzed for protein purity by SDS-PAGE and
Coomassie staining (top, 3 μg per lane) or used to digest 240 ng of
phage T4 DNA, which was then analyzed by gel electrophoresis in a 1%
agarose gel and stained with Gel Red. For each variant, two protein
amounts (0.1 μg, left; 1 μg, right) were tested.

Figure 4. SRA domain pocket. PvuRts1I (A, green) and UHRF1 (B, gray) in
all-atom representation. The flipped 5mC base in the UHRF1 pocket is
observed in the crystal structure. In contrast, the 5hmC base in the
PvuRts1I pocket has been modeled based on the superposition of the two
structures (A, thin lines).

Structure of the SRA domain and model for the DNA recognition

The C-terminal domain of PvuRts1I has the typical SRA domain fold. It is
organized around a central, mixed β-sheet with additional smaller
β-sheets and some helices wrapped around it. Among the SRA domains, the
binding to DNA is experimentally best characterized for UHRF1 (6-8).
Hence, we superimposed the SRA domain of PvuRts1I onto the UHRF1-DNA
co-crystal structure and checked the location of the flipped base
(Figure 4). This analysis reveals that PvuRts1I has indeed a pocket in
the expected location, with sufficient space for 5hmC (Figure 4 and
Supplementary Figure S3). The pocket is formed mainly by Pro207, Trp205,
Trp215, Asn217 and Glu228. A potentially flipped base could make
hydrophobic contacts with Trp215 from one side and Pro207 and Trp205 on
the other. The 5hmC Watson-Crick edge might be recognized by Asn217,
Glu228 and Arg208. The modeling also shows that there is extra space in
the PvuRts1I pocket, so that even 5ghmC might fit in. Given the
uncertainties of the modeling (only 13% sequence identity between the
SRA domains of PvuRts1I and UHRF1), we cannot pinpoint the precise
location of the hydroxyl or glycosylhydroxyl group, but Trp205, Arg244,
Tyr237 and Glu228 are candidate interaction partners for hydrogen
bonding. Interestingly, we see an isolated large peak of electron
density close to the expected location of the glucosylhydroxymethyl
group in the pocket for the flipped base. Unfortunately, the resolution
of the structure is not sufficient to decide whether this peak is due to
a partially disordered glucose molecule (which would explain why it was
essential for crystallization).

Tests of nucleotide flipping and the SRA pocket function 

The detection of an SRA domain in PvuRts1I strongly suggests that the
protein flips the 5hmC or 5ghmC bases in its target sequence for
detailed scrutiny, as suggested for UHRF1 (6-8) and SUVH5 (38) (based on
crystallographic evidence) and for MspJI (10) (based on modeling). We
first attempted to directly demonstrate nucleotide flipping using DNA
with the environment sensitive fluorophore pyrrolocytosine (pyC) instead
of the 5hmC in either one or both DNA strands. Preliminary experiments
showed that pyC and 5hmC could both direct the cleavage of DNA
oligoduplexes. With the 5hmC substrate (and a mixed 5hmC/pyC substrate)
we observed two cleavage sites, one in the expected position and another
one closer to the 5hmC base. With the pyC/pyC substrate, only the
non-canonical cleavage closer to the base was observed (Supplementary
Figure S4A). The pattern was not affected by the location of the
histidine tag at either N- or C-terminus of PvuRts1I. Unfortunately,
there was no significant increase of pyC fluorescence when PvuRts1I was
added in the absence of divalent metal cations (Supplementary Figure
S4B). This result is consistent with pyC (and by implication 5hmC or
5ghmC) not being flipped. However, it could also be due to the
non-canonical cleavage for pyC substrates, to efficient quenching of the
fluorescence by the SRA domain or to unintended tertiary structure of
the oligoduplex (which eluted from a gel filtration column in several
peaks).

As the pyC fluorescence experiments were inconclusive, we carried out
mutagenesis experiments. According to the crystal structure, Trp205,
Trp215 and Glu228 might contribute to shaping the walls of the PvuRts1I
pocket. In contrast, Arg208 contributes to the pocket primarily by its
main chain, but not by the side chain, which points away from it. We
separately changed all four residues to alanines. The Trp205Ala
substitution made PvuRts1I insoluble, but the other variants could be
assayed. The Arg208Ala variant was only mildly compromised in its
activity, but both the Trp215Ala and Glu228Ala variants lost activity
completely (Figure 3), as one would predict if the pocket of the SRA
domain was required to accommodate the modified cytosine base. We
conclude that the mutagenesis experiments support the hypothesis that
the SRA domain of PvuRts1I flips the 5hmC or 5ghmC bases in a substrate.

DISCUSSION 

A variant PD-(D/E)XK domain in PvuRts1I

The classification of PvuRts1I as a PD-(D/E)XK endonuclease is
consistent with an earlier prediction (27) and supported by biochemical
findings and the structural data. As noticed earlier and confirmed in
this work, PvuRts1I is active in the presence of Mg2+, but not Ca2+ ions
(Supplementary Figure S5). This is typical for PD-(D/E)XK restriction
endonucleases, but would not be expected for HNH (also called ββα-Me),
GIY-YIG or phospholipase-like nucleases. The assignment is further
supported by the presence of the core αβββ folding motif in the
catalytic domain of PvuRts1I and by the presence of candidate active
site residues (with the exception of the lysine) in their expected
locations.

Figure 5. Models for the PvuRts1I dimer bound to target DNA. (A) Bent
DNA and PvuRts1I with linker conformation observed in the crystal. (B)
Straight DNA and PvuRts1I with adjusted interdomain linker. Nuclease
domains (with modeled metal ions) were placed on the DNA based on the
superposition with the NgoMIV-DNA complex (37). The binding mode of the
DNA to the SRA domain is modeled after the UHRF1-DNA complex.

Modification specific SRA domain, unspecific nuclease domain

PvuRts1I cleaves DNA approximately 11-13 nucleotides from 5hmC or 5ghmC
bases. Taking the typical distance between adjacent base pairs as 3.4 Å
(as in ideal B-DNA) and assuming straight DNA, the cleavage site is
expected to be 30-45 Å away from the modified base. This distance is
comparable to or larger than the largest linear dimension of the
PD-(D/E)XK domain and therefore makes it unlikely that the nuclease
domain is directly involved in sensing the DNA modifications. This
suggests that the 5hmC or 5ghmC bases are 'read' by the SRA domain, in
agreement with earlier findings for homologous domains of UHRF1 and
MspJI, which are specific for modified DNA bases. Unfortunately, the
isolated domains of PvuRts1I could only be expressed in insoluble form,
and thus this model of PvuRts1I activity could not be directly tested
biochemically.
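
The distance estimate in this section is simple arithmetic on the
B-DNA rise per base pair. A minimal sketch of that calculation follows;
the function name `axial_distance` is a hypothetical helper, and the
model deliberately ignores DNA bending and the offset between strands.

```python
# Rise per base pair of ideal B-DNA in Angstrom, the value used in the text.
RISE_PER_BP_ANGSTROM = 3.4


def axial_distance(n_steps, rise=RISE_PER_BP_ANGSTROM):
    """Distance along the helix axis spanned by n_steps base-pair steps,
    assuming straight, ideal B-DNA."""
    return n_steps * rise


# Cleavage occurs approximately 11-13 nucleotides from the modified base.
for n in (11, 12, 13):
    print(f"{n} nt -> {axial_distance(n):.1f} Å")
```

For 11 to 13 nucleotides this gives roughly 37 to 44 Å, which falls
inside the 30-45 Å range quoted above.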

A model for PvuRts1I catalytic domain-DNA complex

Based on prior co-crystal structures of PD-(D/E)XK domain restriction
endonucleases with substrate DNA, such as the NgoMIV-DNA co-crystal
structure (37), it is possible to place the PvuRts1I nuclease domain on
the DNA in a 'productive' orientation (Figure 5). In fact, such modeling
results in protein-DNA clashes only in the region of a single helix of
the enzyme, which could move to the DNA major groove upon complex
formation. The predicted DNA binding mode is also consistent with
calculations of the surface properties of the protein, once the metal
cations that are expected in the active site of a PD-(D/E)XK
endonuclease (but were absent from the crystallization buffer) are
included in the calculation (Supplementary Figure S6). As the stagger
between the two single strand cuts is known (2-nt 3'-overhangs in the
product) (25), the modeling also defines the relative orientation of the
PD-(D/E)XK domains with respect to each other. In support of the model,
resulting clashes between the nuclease domains appear resolvable by
local rearrangements (Figure 5).

In agreement with the biochemical data for the PvuRts1I variant used in
this work, we do not find the predicted dimer in the crystal. There is a
crystallographic neighbor of the single molecule in the asymmetric unit
in roughly the expected position, but its orientation is completely
unlike what one would expect for a productive dimer. Moreover, the PISA
server (39), which analyzes protein-protein contacts in a crystal,
scores none of the interfaces in the PvuRts1I crystal as biologically
relevant and classifies the protein as a monomer. The protein used in
this work differs only by the N-terminal histidine tag from the protein
characterized previously (produced with an intein tag cleaved off during
purification) (22). A comparison of the gel filtration patterns of N-
and C-terminally tagged PvuRts1I suggests that the N-terminal tag
slightly destabilizes the dimer. Thus, it appears that the tag, despite
not being located in the dimerization interface, together with
crystallization forces could be responsible for the unusual monomeric
state of the protein in the crystal.

A model for the full-length PvuRts1I dimer with bound DNA

Based on the co-crystal structure of the UHRF1 SRA domain with DNA
(6-8), we can also model DNA bound to this domain of PvuRts1I. The next
step then is to connect the DNA duplexes bound to the SRA and PD-(D/E)XK
domains of the enzyme. Reassuringly, the biochemically predicted number
of base pairs is very suitable to bridge the distance between the
cleavage sites and the positions of the modified bases on the two DNA
strands, provided that the DNA is sufficiently bent between the two
regions (Figure 5A). Alternatively, a plausible model can also be built
if PvuRts1I interdomain linkers are taken to be flexible and the domain
orientations are adjusted so that the protein binds straight B-DNA
(Figure 5B). We also note that the predicted DNA binding sites of both
domains are qualitatively supported by calculations of the PvuRts1I
electrostatic surface, which has clear patches of positive charge in
these regions (Supplementary Figure S6). We presume that these parts of
the protein are involved in interactions with the negatively charged
phosphodiester backbone of DNA and account for the observed high
unspecific DNA affinity.

Nucleic Acids Research, 2014, Vol. 42, No. 9 5935

The modeling data appear compatible with a FokI/TALEN-like (40) model
for PvuRts1I activity (apart from the order of domains: the catalytic
domain is N-terminal in PvuRts1I and C-terminal in TALENs). According to
this view, the SRA domains of the PvuRts1I dimer act as the modification
specific counterparts of the sequence specific TAL domains, and the
nuclease domains play the role of the FokI domains of the TALEN pair.
The higher activity of PvuRts1I against substrates with two rather than
one modified base could be an avidity effect, or might be attributed to
an activating 'kissing interaction' between the nuclease domains (41).